Abstract. This paper describes the participation of the SNUMedinfo team in the
TREC Web track 2014. This is the first time we have participated in the Web track.
Rather than applying more sophisticated retrieval methods such as learning-to-rank
models, this year we used only baseline retrieval models with spam filtering and
1.  Introduction 

In this paper, we describe the methods used by the SNUMedinfo team in
the TREC Web track 2014. For a detailed introduction to the task, please see the
overview paper of this track.

2.  Methods 

We used the sequential dependence model (SDM) [1] as our baseline retrieval model.
For the experiments, we used the batch query service offered by the Lemur Project
website [2], with the ClueWeb12-Full dataset as our test corpus. The Waterloo spam
filter [3] was used to filter out spam documents. Our submitted runs are summarized
in Table 1.


Table 1. Submitted runs

RunID          Method description
SNUMedinfo11   SDM
SNUMedinfo12   SDM + Spam filtering (threshold: 50)
SNUMedinfo13   SDM + Spam filtering (threshold: 50) + PageRank prior score

SDM: Sequential dependence model
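
To make the baseline concrete, the following is a minimal sketch of how an SDM query could be written in the Indri query language for submission to a batch query service. The function name `sdm_query`, the weights (0.85, 0.10, 0.05), and the unordered window size of 8 are assumptions taken from commonly cited SDM defaults; the actual parameter settings of the submitted runs are not stated here.

```python
def sdm_query(text, w_term=0.85, w_ordered=0.10, w_unordered=0.05, window=8):
    """Build an Indri-style SDM query string from a whitespace-tokenized query.

    The weights and window size are illustrative defaults (assumptions),
    not values reported for the submitted runs.
    """
    terms = text.lower().split()
    # With fewer than two terms there are no adjacent pairs, so fall back
    # to a plain unigram query.
    if len(terms) < 2:
        return "#combine( " + " ".join(terms) + " )"

    # SDM models dependencies between adjacent query terms only.
    pairs = list(zip(terms, terms[1:]))
    unigrams = " ".join(terms)
    # #1(a b): exact ordered phrase; #uwN(a b): both terms within N words.
    ordered = " ".join(f"#1( {a} {b} )" for a, b in pairs)
    unordered = " ".join(f"#uw{window}( {a} {b} )" for a, b in pairs)

    return (
        f"#weight( {w_term} #combine( {unigrams} ) "
        f"{w_ordered} #combine( {ordered} ) "
        f"{w_unordered} #combine( {unordered} ) )"
    )


if __name__ == "__main__":
    print(sdm_query("hot dog stand"))
```

The three components correspond to unigram matches, exact adjacent phrases, and unordered co-occurrence of adjacent terms within a small window.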


For SNUMedinfo13, we used the PageRank prior [4] scores offered by the Lemur Project
website.
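
The spam filtering and prior steps in the run descriptions can be sketched as post-processing of a ranked list. This is an illustration under stated assumptions, not the exact procedure used for the runs: it assumes spam scores are percentiles where lower values mean "more likely spam", that a document is kept when its percentile is at or above the threshold, and that the prior is a log-probability added to the retrieval score. The function names, the `weight` parameter, and the default prior for missing documents are hypothetical.

```python
def filter_spam(ranked, spam_percentile, threshold=50):
    """Keep documents whose spam percentile is at or above the threshold.

    ranked: list of (doc_id, score) pairs, best first.
    spam_percentile: dict mapping doc_id to a percentile score.
    Assumption: lower percentiles are spammier; documents without a
    spam score are dropped (a conservative, hypothetical choice).
    """
    return [
        (doc, score)
        for doc, score in ranked
        if spam_percentile.get(doc, -1) >= threshold
    ]


def add_prior(ranked, log_prior, weight=1.0, default=-20.0):
    """Add a weighted log-prior to each score and re-sort, best first.

    Assumption: scores and priors are both log-probabilities, so they
    combine additively. `weight` and `default` are illustrative only.
    """
    rescored = [
        (doc, score + weight * log_prior.get(doc, default))
        for doc, score in ranked
    ]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    ranked = [("d1", -5.0), ("d2", -5.5), ("d3", -6.0)]
    spam = {"d1": 10, "d2": 80, "d3": 55}
    prior = {"d2": -3.0, "d3": -1.0}
    kept = filter_spam(ranked, spam)
    print(add_prior(kept, prior))
```

Because the prior is query-independent, a large `weight` can push popular but off-topic pages above relevant ones, so the mixing weight is a parameter that would normally be tuned.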


3.  Results 


Table 2. Evaluation results

RunID          ndcg@20   err@20
SNUMedinfo11   0.2436    0.1386
SNUMedinfo12   0.2698    0.1759
SNUMedinfo13   0.1927    0.1230
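
For readers unfamiliar with the two measures in Table 2, the sketch below computes nDCG@20 and ERR@20 for a single topic from graded relevance judgments. It follows the common textbook definitions (exponential gain, log2 rank discount, cascade-style stopping probability); the maximum grade of 4 and the clamping of negative grades to zero are assumptions, and the official track evaluation script may differ in details.

```python
import math


def _gain(grade):
    # Exponential gain; negative grades (e.g. spam labels) count as 0.
    return 2 ** max(grade, 0) - 1


def dcg_at_k(grades, k=20):
    """Discounted cumulative gain over the top-k grades (rank 1 first)."""
    return sum(_gain(g) / math.log2(i + 2) for i, g in enumerate(grades[:k]))


def ndcg_at_k(ranked_grades, all_judged_grades, k=20):
    """DCG of the ranking divided by the DCG of an ideal ordering."""
    ideal = dcg_at_k(sorted(all_judged_grades, reverse=True), k)
    return dcg_at_k(ranked_grades, k) / ideal if ideal > 0 else 0.0


def err_at_k(grades, k=20, max_grade=4):
    """Expected reciprocal rank under a cascade user model.

    max_grade is the highest possible relevance grade (an assumption).
    """
    p_continue = 1.0  # probability the user reaches the current rank
    err = 0.0
    for rank, g in enumerate(grades[:k], start=1):
        p_stop = _gain(g) / 2 ** max_grade  # chance this document satisfies
        err += p_continue * p_stop / rank
        p_continue *= 1.0 - p_stop
    return err


if __name__ == "__main__":
    run = [2, 0, 3, 1]          # grades of the returned documents, in rank order
    judged = [3, 2, 1, 1, 0]    # all judged grades for the topic
    print(ndcg_at_k(run, judged), err_at_k(run))
```

Per-topic values are averaged over all topics to obtain figures comparable to those in the table.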

4.  Conclusion 

This year, we submitted runs using a baseline retrieval model with spam filtering
and PageRank prior scores. We plan to experiment with more advanced retrieval
methods in next year's participation.